Mining E-mail Authorship

Author

  • Olivier de Vel
Abstract

In this paper we report an investigation into learning authorship identification, or categorisation, for e-mail documents. We use various e-mail document features, such as structural characteristics and linguistic evidence, together with the Support Vector Machine as the learning algorithm. Experiments on a number of e-mail documents give promising results, with some e-mail document features and author categories yielding better categorisation performance than others.
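As a rough illustration of what turning an e-mail into structural and linguistic features might look like, the sketch below maps an e-mail body to a small numeric feature dictionary. The particular features, the word lists, and the name `extract_features` are assumptions made for this example; the abstract does not list the features actually used. The resulting vectors could then be passed to any Support Vector Machine implementation.

```python
import re

# Hypothetical feature set: these specific features and word lists are
# illustrative assumptions, not the ones used in the paper.
GREETINGS = ("hi", "hello", "dear")
FUNCTION_WORDS = ("the", "of", "and", "to", "in")


def extract_features(body: str) -> dict:
    """Map one e-mail body to a small dictionary of numeric features."""
    lines = body.splitlines()
    words = re.findall(r"[A-Za-z']+", body.lower())
    n_words = len(words) or 1  # avoid division by zero on empty bodies
    n_lines = len(lines) or 1

    # First non-blank line, lower-cased, used for a simple greeting check.
    first_line = next((ln.strip().lower() for ln in lines if ln.strip()), "")

    features = {
        # Structural characteristics
        "n_lines": len(lines),
        "blank_line_ratio": sum(1 for ln in lines if not ln.strip()) / n_lines,
        "has_greeting": float(first_line.startswith(GREETINGS)),
        # Linguistic evidence
        "avg_word_length": sum(len(w) for w in words) / n_words,
    }
    # Relative frequency of a few common function words.
    for fw in FUNCTION_WORDS:
        features[f"freq_{fw}"] = words.count(fw) / n_words
    return features


sample = "Hi Sam,\n\nThe draft of the report is attached.\n\nRegards,\nAlex"
features = extract_features(sample)
```

Function-word frequencies are used here because they are largely independent of topic, which is a common motivation for style-based features; whether the paper relies on them is not stated in the abstract.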

Similar resources

A Novel Approach of Mining Write-Prints for Authorship Attribution in E-mail Forensics

There is an alarming increase in the number of cybercrime incidents through anonymous e-mails. The problem of e-mail authorship attribution is to identify the most plausible author of an anonymous e-mail from a group of potential suspects. Most previous contributions employed a traditional classification approach, such as decision tree and Support Vector Machine (SVM), to identify the author an...

Gender-Preferential Text Mining of E-mail Discourse

This paper describes an investigation of authorship gender attribution mining from e-mail text documents. We used an extended set of predominantly topic content-free e-mail document features such as style markers, structural characteristics and gender-preferential language features together with a Support Vector Machine learning algorithm. Experiments using a corpus of e-mail documents generate...

E-mail authorship attribution using customized associative classification

E-mail communication is often abused for conducting social engineering attacks including spamming, phishing, identity theft and for distributing malware. This is largely attributed to the problem of anonymity inherent in the standard electronic mail protocol. In the literature, authorship attribution is studied as a text categorization problem where the writing styles of individuals are modeled...

Language and Gender Author Cohort Analysis of E-mail for Computer Forensics

We describe an investigation of authorship gender and language background cohort attribution mining from e-mail text documents. We used an extended set of predominantly topic content-free e-mail document features such as style markers, structural characteristics and gender-preferential language features together with a Support Vector Machine learning algorithm. Experiments using a corpus of e-m...

Visualizing IKAT Co-Authorship Networks by Text Mining MICC-IKAT Annual Reports 1993–2005

This short paper presents two experiments that address visualization of correlated authors which have been obtained through text mining IKAT annual reports. By generating a 2-dimensional topology that reflects the structure underlying the extracted features, a person investigating this information can oversee the information more efficiently. Similar techniques can be used in financial fraud in...

Publication date: 2000